Abstract. Recently Chen and Shao developed a Stein-type method to obtain bounds on the
closeness of the distribution of a general nonlinear statistic to that of a linear approximation.
We generalize these results so as to allow one to use lesser moment restrictions when applied to
nonlinear statistics expressed as smooth enough functions of sums of independent random vectors.
Our main innovation in the method is the use of a Cramér-type tilt transform. Other techniques
used to obtain improvements include exponential and Rosenthal-type inequalities for sums
of random vectors established by Pinelis and Sakhanenko. As applications, Berry-Esseen type
bounds are obtained for concrete nonlinear statistics such as the Pearson correlation coefficient
and the non-central Student and Hotelling statistics.
1. Introduction

We were interested in studying certain properties of the Pitman asymptotic relative efficiency
(ARE) between Pearson's, Kendall's, and Spearman's correlation coefficients. As is well known
(see e.g. [2^), the standard expression for the Pitman ARE is applicable when the distributions of
the corresponding test statistics are close to normality uniformly over a neighborhood of the null
set of distributions. Such uniform closeness can usually be provided by Berry-Esseen (BE) type
bounds.

Kendall's and Spearman's correlation coefficients are instances of U-statistics, for which BE
bounds are well known; see e.g. [2j]. As for the Pearson statistic (say R), we have not been able to
find a BE bound in the literature.

This may not be very surprising, considering that an optimal BE bound for the somewhat similar
(and, perhaps, somewhat simpler) Student's statistic was obtained only in 1996, by Bentkus and
Götze for independent identically distributed (i.i.d.) random variables (r.v.'s) and by Bentkus,
Bloznelis and Götze [3| in the general, non-i.i.d. case. (A necessary and sufficient condition, in
the i.i.d. case, for the Student statistic to be asymptotically standard normal was established only
in 1997 by Giné, Götze and Mason fl^.) For more recent developments concerning the Student
statistic, see e.g. the 2005 paper by Shao [43].

Date: May 31, 2009; file: arxiv.tex.

2000 Mathematics Subject Classification. Primary: 60F05, 60E15, 62F12. Secondary: 60E10, 62G10, 62G20,
62F03, 62F05.

Key words and phrases. Berry-Esseen bound, nonlinear statistics, Pearson's correlation coefficient, non-central
Student's statistic, non-central Hotelling's statistic, Stein method, Cramér's tilt, exponential inequalities, Rosenthal's
inequality.

IOSIF PINELIS AND RAYMOND MOLZON

Employing such simple and standard tools as linearization together with the Chebyshev and 
Rosenthal inequalities, we quickly obtained (in the i.i.d. case) a uniform bound of the form 0(n~^/^) 
for the Pearson statistic. Indeed, Pearson's R can be expressed as $f(\bar V)$, a smooth nonlinear function
of the sample mean $\bar V = \frac1n \sum_{i=1}^n V_i$, where the $V_i$'s are independent zero-mean random vectors
constructed based on the observations of a random sample; cf. (4.8). A natural approximation to
$f(\bar V)$ is the linear statistic $L(\bar V) = \sum_{i=1}^n L(\frac1n V_i)$, where $L$ is the linear functional that is the first
derivative of $f$ at the origin. Since BE bounds for linear statistics are a well-studied subject, we are
left with estimating the closeness between $f(\bar V)$ and $L(\bar V)$. Assuming $f$ is smooth enough, one will
have $|f(\bar V) - L(\bar V)|$ on the order of $\|\bar V\|^2$, and so, demonstrating the smallness of this remainder
term becomes the main problem.

Using (instead of the mentioned Rosenthal inequality) exponential inequalities for sums of random
vectors due to Pinelis and Sakhanenko [40] or Pinelis [34, 35], for each $p \in (2,3)$, under the
assumption of the finiteness of the $p$th moment of the norm of the $V_i$'s, one can obtain a uniform
bound of the form $O(1/n^{p/2-1})$, which is similar to the BE bound for a linear statistic with a comparable
moment restriction. However, the corresponding constant factor in the $O(1/n^{p/2-1})$ will then
explode to infinity as $p \uparrow 3$. As for $p \ge 3$, this method produces bounds of order $O((\ln n)^{3/2}/\sqrt{n})$
(for $p = 3$) and $O((\ln n)/\sqrt{n})$ (for $p > 3$), with extra logarithmic factors. Concerning this method
and the corresponding results, see Proposition 3.9 in the present paper.

While any of these bounds would have sufficed as far as the ARE is concerned, we became
interested in obtaining an optimal-rate BE bound for the Pearson statistic. Soon after that, we
came across the recent remarkable paper by Chen and Shao [9]. Suppose that $T$ is any nonlinear
statistic and $W$ is any linear one, and let $\Delta := T - W$; then make the simple observation that

$$-P(z - |\Delta| < W \le z) \le P(T \le z) - P(W \le z) \le P(z < W \le z + |\Delta|)$$

for all $z \in \mathbb{R}$. Chen and Shao [9] offer a Stein-type method to provide relatively simple bounds on
the two concentration probabilities in the above inequality, hence bounding the distance between
$T$ and $W$; the reader is referred e.g. to [l| for illustrations of the elegance and power of Stein's
method as applied to a wide array of problems. Chen and Shao provided a number of applications of their
general results.
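
The two-sided concentration observation above can be checked exactly on a toy example. The following Python sketch is only an illustration under assumptions of our own choosing: the Rademacher summands, the hypothetical nonlinear statistic T = W + W^2/sqrt(n), and the values n = 10 and z = 0.7 are not taken from the paper.

```python
import math
from itertools import product

n = 10      # number of summands (illustrative choice)
z = 0.7     # point at which the distribution functions are compared

lower = diff = upper = 0.0
for signs in product((-1, 1), repeat=n):
    prob = 0.5 ** n                      # each sign pattern is equally likely
    w = sum(signs) / math.sqrt(n)        # linear statistic W, with E W^2 = 1
    t = w + w * w / math.sqrt(n)         # hypothetical nonlinear statistic T
    delta = abs(t - w)                   # |Delta|, where Delta = T - W
    diff += prob * ((t <= z) - (w <= z))         # P(T <= z) - P(W <= z)
    lower -= prob * (z - delta < w <= z)         # -P(z - |Delta| < W <= z)
    upper += prob * (z < w <= z + delta)         # P(z < W <= z + |Delta|)

print(lower, diff, upper)
```

All 2^n sign patterns are enumerated, so the three quantities are computed exactly rather than simulated; the middle one lies between the other two, as the observation asserts.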

However, in the applications that we desired, such as to Pearson's R, it was difficult to deal with
$\Delta = T - W$, as defined above. The simple cure applied here was to allow for any $\Delta \ge |T - W|$, so
that, for $T = f(\bar V)$, $W = L(\bar V)$, and smooth enough $f$, the random variable $\Delta$ could be taken as
$\|\bar V\|^2$ (up to some multiplicative constant). This allowed for a BE bound of order $O(1/\sqrt{n})$, though
under the excessive moment restriction that $E\|V_1\|^4 < \infty$.

To obtain a BE bound of the "optimal" order $O(1/\sqrt{n})$ using only the assumption $E\|V_1\|^3 < \infty$,
we combine the Chen-Shao technique with a Cramér-type tilt transform, which appears to be the
most important and novel modification of the Stein-type method given in the present paper. Yet
another modification was made by introducing a second level of truncation, to obtain a bound of
order $O(1/n^{p/2-1})$ in the case when $E\|V_1\|^p < \infty$ for $p \in [2,3)$. As for the requirement that the
observations be identically distributed, it may (and will) be dispensed with in general; that is, $\bar V$ will in
general be replaced by a sum $S$ of independent but not necessarily identically distributed random
vectors.

There are two main groups of results in this paper. One is represented by Theorem 2.3, which
provides a "non-uniform" upper bound on $|P(T \le z) - P(W \le z)|$ (that is, an upper bound which
decreases to 0 in $|z|$), for a general nonlinear statistic $T$ and a general linear statistic $W$; a "uniform"
bound on $|P(T \le z) - P(W \le z)|$ is given by Theorem 2.1. The other kind of main results, obtained
based on Theorems 2.1 and 2.3, is represented by Theorem 3.5, which provides a non-uniform upper
bound on $|P(f(S) \le z) - P(L(S) \le z)|$; it is the latter bound that took more of our time and effort.
Once such a bound is established, it becomes rather straightforward to obtain the desired BE bound
for the Pearson statistic, as well as for other similar statistics, including the non-central Student
and Hotelling ones.
The paper is organized as follows.

- In Section 2 we state and discuss the mentioned upper bounds on $|P(T \le z) - P(W \le z)|$ for
general $T$ and $W$, as well as certain other related results; in particular, in this section we provide
an improvement (Proposition 2.5) of a non-uniform BE bound by Osipov and Petrov for linear
statistics.

- In Section 3 the mentioned Theorem 3.5 and other results are stated, providing bounds on
$|P(f(S) \le z) - P(L(S) \le z)|$; a certain optimality and other nice properties of these bounds are
presented and discussed there.

- Applications to several commonly used statistics, namely the non-central Student $T$, the Pearson
$R$, and the non-central Hotelling $T^2$, are stated in Section 4. The resulting BE bounds for these
statistics appear to be new to the literature.

- All proofs are deferred to Section 5.

2. Approximation of the distributions of nonlinear statistics by the distributions of linear ones

Let $X_1, \dots, X_n$ be independent r.v.'s with values in some measurable space $\mathfrak{X}$, and let $T\colon \mathfrak{X}^n \to \mathbb{R}$
be a statistic of the random sample $(X_i)_{i=1}^n$. Further let

(2.1) $\xi_i := g_i(X_i)$ and $\eta_i := h_i(X_i)$

for $i = 1, \dots, n$, where $g_i\colon \mathfrak{X} \to \mathbb{R}$ and $h_i\colon \mathfrak{X} \to \mathbb{R}$ are Borel-measurable functions. Assume that

$E\xi_i = 0$ for all $i = 1, \dots, n$, and $\sum_{i=1}^n E\xi_i^2 = 1$.

Consider the linear statistic

(2.2) $W := \sum_{i=1}^n \xi_i$.

Further let $\delta$ be any real number such that

(2.3) $\sum_{i=1}^n E|\xi_i|(\delta \wedge |\xi_i|) \ge \frac12$;

note that such a number always exists (because the limit of the left-hand side of (2.3) as $\delta \uparrow \infty$ is
1). Necessarily, $\delta > 0$. Also consider the sum of the mixed second-third moments

(2.4) $\beta := \sum_{i=1}^n E(\xi_i^2 \wedge |\xi_i|^3) = \sum_{i=1}^n E\xi_i^2\, 1\{|\xi_i| > 1\} + \sum_{i=1}^n E|\xi_i|^3\, 1\{|\xi_i| \le 1\}$.
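
To make the quantities in (2.3) and (2.4) concrete, here is a minimal Python sketch that computes the smallest admissible value of delta and the value of beta for discrete summands. The function names (`lhs_23`, `smallest_delta`, `beta_24`) and the Rademacher example with n = 100 are hypothetical choices made for this illustration only; the code assumes that the summands are centered with variances summing to 1, as required above.

```python
import math

def lhs_23(delta, summands):
    """Left-hand side of (2.3): sum_i E|xi_i| * min(delta, |xi_i|).

    `summands` is a list of discrete distributions, one per xi_i,
    each given as a list of (value, probability) pairs.
    """
    return sum(p * abs(v) * min(delta, abs(v))
               for dist in summands for v, p in dist)

def smallest_delta(summands, tol=1e-12):
    """Bisection for the smallest delta with lhs_23(delta) >= 1/2.

    Assumes sum_i E xi_i^2 = 1, so the left-hand side increases to 1.
    """
    lo, hi = 0.0, 1.0
    while lhs_23(hi, summands) < 0.5:
        hi *= 2.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if lhs_23(mid, summands) >= 0.5:
            hi = mid
        else:
            lo = mid
    return hi

def beta_24(summands):
    """beta of (2.4): sum_i E min(xi_i^2, |xi_i|^3)."""
    return sum(p * min(v * v, abs(v) ** 3)
               for dist in summands for v, p in dist)

n = 100
# xi_i = eps_i / sqrt(n) with Rademacher eps_i: zero mean, variances sum to 1.
rademacher = [(1 / math.sqrt(n), 0.5), (-1 / math.sqrt(n), 0.5)]
summands = [rademacher] * n
delta = smallest_delta(summands)
beta = beta_24(summands)
print(delta, beta)
```

In this example the smallest delta is 1/(2 sqrt(n)) and beta is 1/sqrt(n), so both quantities are of the classical Berry-Esseen order.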

Theorem 2.1. Let $\Delta$ be any r.v. such that $|\Delta| \ge |T - W|$. For each $i = 1, \dots, n$, let $\Delta_i$ be any
r.v. such that $X_i$ and $(\Delta_i, W - \xi_i)$ are independent. Then for all $z \in \mathbb{R}$

(2.5) $|P(T \le z) - P(W \le z)| \le 4\delta + E|W\bar\Delta| + \sum_{i=1}^n E|\xi_i(\bar\Delta - \Delta_i)| + P(\max_i |\eta_i| > 1)$

and

(2.6) $|P(T \le z) - P(W \le z)| \le 2\beta + E|W\bar\Delta| + \sum_{i=1}^n E|\xi_i(\bar\Delta - \Delta_i)| + P(\max_i |\eta_i| > 1)$,

where $\bar\Delta$ is any r.v. such that

(2.7) $\bar\Delta = \Delta$ on the event $\{\max_i |\eta_i| \le 1\}$.

Further, for all $z \in \mathbb{R}$

(2.8) $|P(T \le z) - \Phi(z)| \le 6.1\beta + E|W\bar\Delta| + \sum_{i=1}^n E|\xi_i(\bar\Delta - \Delta_i)| + P(\max_i |\eta_i| > 1)$,

where $\Phi(z)$ is the standard normal distribution function.






Remark 2.2. Inequalities (2.5), (2.6), and (2.8) are the same as the ones found in Chen and Shao's
paper [9, Theorem 2.1], with two exceptions. In the first place, there they defined $\Delta$ to be equal
to $T - W$. The second generalization comes from an added truncation level via the inclusion of $\bar\Delta$
and the subsequent addition of the term $P(\max_i |\eta_i| > 1)$. As bounding $E|\xi_i(T - W - \Delta_i)|$ may be
rather cumbersome depending on the form of $T - W$, the first generalization allows one to choose a
possibly larger $\Delta$ which would be more amenable to analysis. However, if that $\Delta$ should happen to
be "too large" (i.e. if it violates some moment assumptions), the second generalization allows one
to truncate $\Delta$ to within acceptable constraints. This will prove useful in the construction of the
Berry-Esseen type bounds of Section 3 when $p \in [2, 3)$, though it should be noted that a choice of
$h_i = 0$ (say) and $\bar\Delta = \Delta = T - W$ returns us to the original bounds in [9]. These two generalizations
are also used in the non-uniform bounds of Theorem 2.3 below.

Before stating the "non-uniform" extension of Theorem 2.1, let us introduce some notation. For
arbitrary $p \in [1, \infty)$, let

(2.9) $\sigma_p := \Big(\sum_{i=1}^n \|\xi_i\|_p^p\Big)^{1/p}$,

where $\|X\|_p := E^{1/p}|X|^p$ for any real-valued r.v. $X$. Further, for any $n$-tuple $\zeta = (\zeta_1, \dots, \zeta_n)$ of
real-valued r.v.'s, let

$G_\zeta(z) := \sum_{i=1}^n P(|\zeta_i| > z)$

for arbitrary $z \ge 0$, where the subscript $\zeta$ refers to the $\zeta_i$'s.

Let A denote positive absolute constants, possibly different in different instances. Similarly, let 
A{p) denote positive expressions depending only on p, also possibly different in different instances. 
Additionally let 

a b mean a ^ A(p)&, 

where a, b are nonnegative expressions; the use of this simplifying notation may sometimes result 
in a loss of information, though the information could be regained by reworking the arguments. 

Theorem 2.3. Let $\Delta$ be any r.v. such that $|\Delta| \ge |T - W|$. For each $i = 1, \dots, n$, let $\Delta_i$ be any
r.v. such that $X_i$ and $(\Delta_i, (X_j\colon j \ne i))$ are independent, and assume that the mentioned
Borel-measurable functions $g_i$ and $h_i$ are such that $|h_i| \ge |g_i|$, so that $|\xi_i| \le |\eta_i|$ almost surely (a.s.). Take
any $p \ge 2$ and let $q := \frac{p}{p-1}$, so that $\frac1p + \frac1q = 1$. Then for all $z \in \mathbb{R}$

(2.10) I p(r z) - F{w z)| sjp 7^ + Te"l^l/^ 

where 

n 

(2.11) 7. := P(|A| > ^) + Ge(^) +Y.^{\W - > ^) n\v^\ > 1), 

n 

(2.12) T (II All, + S){1 + ap) + ^Hedlpll A - A,||„ 
and A is any r.v. satisfying ()2.7p . 

Remark 2.4. As will be made clear in the proof, r in (j2.12|) could be replaced by 

n 

{\\A\\,,+S)il + ap,) + -A,\U, 

i=l 

for two different sets of conjugate numbers {pi,qi) and {p2,q2), with pi,p2 ^ 2 and pi 7^ P2 a 
distinct possibility; A{p) (suppressed by the "<p" notation) would then be replaced by A{pi,p2), 
depending only on pi and P2. 

For $p = 2$ (and with $h_i = g_i$ and $\bar\Delta = \Delta = T - W$), Theorem 2.3 was obtained by Chen and Shao
[9, Theorem 2.2]. The more general form of the bound given by (2.10) allows one to lessen moment
restrictions. Indeed, in applications of Theorem 2.3 given in this paper - such as Theorem 3.5 -
one will have $|\Delta|$ on the order of $\|S\|^2$ and $|\Delta - \Delta_i|$ on the order of $\|X_i\|^2 + \|X_i\|\,\|S - X_i\|$, where
$S := \sum_{i=1}^n X_i$ and the $X_i$'s are independent random vectors. So, using Theorem 2.3 with $p = 3$
(and hence $q = \frac32$) in order to obtain a bound of the classical form 0{ ^Q^^_^i^a ), one will need only
the 3rd moments of $\|X_i\|$ to be finite. On the other hand, using (2.12) with $p = 2$ to get the same
kind of bound would require the finiteness of the 4th moments of $\|X_i\|$.

Bound (2.10) on the closeness of the distribution of the linear approximation $W$ to that of the
original statistic $T$ can be complemented by the following bounds on the closeness of the distribution
of the linear statistic $W$ to the standard normal distribution.

Proposition 2.5. Let p ^ 2. Then for W as in (|2.2|) . and iji as in (|2.1|) with ^ \rii\ a.s. for 
i — I, . . . ,n, and for aZ/ z G R one has 

(2.13) I ¥iW ^z)~ $(z)| B{z,p) := Bi{z) A ^2(2,^), 

where 



("4, .,,:.t<(J^)'A(JM_V 



(2.15) B,i..p) := G.,(!|^) + (^iffrjF + jl^) I{<l = l + < !)■ 

Note that the bound Bi (z) in (|2.14p was obtained in a more general form by Bikelis Theorem 4] 
(see also 33, Chapter V, Supplement 24]), and also in its present form by Chen and Shao 't', Theorem 



2.2]. The more classical non- uniform version of the Berry-Esseen inequality is implied by (|2.13p : 

I FiW ^ z) - <I>(z)| Si(z) «C p^Y)? 

when p e [2, 3]. This was also stated, for p = 3, in ^]; the case when p — 3 and the ^i's are i.i.d. is 
due to Nagaev [2^ . 

Similarly, when gi — hi (and hence = rji) for i = 1, . . . ,rt, (|2.13p and Chebyshev's inequality 
imply 

I F{W ^z)- $(z)| B,{z,p) j^-^ + 

which is a generalization and improvement of the known Osipov-Petrov theorem (see [33I Theo- 
rem 13 of Chapter V] and also Osipov 32]); that theorem was given for p ^ 3, i.i.d. ^^'s, and 
with {\z\ + 1)P in place of e'^'^^. While this latter bound may appear more familiar, the accuracy 
provided by the sum of the tail probabilities G,, in (|2.15p (rather than the sum of the absolute 
moments given by u^) shall prove useful. 

In the remainder of the paper, uniform and non-uniform bounds on the distance between the
distributions of the nonlinear statistic $T$ and its linear approximation $W$ shall be stated, with the
acknowledgement that Proposition 2.5 may be used to place a bound on the distance between the
distribution of $T$ and the standard normal distribution. Further, non-uniform bounds shall be
stated for $z$ sufficiently far away from the origin, with the understanding that the accompanying
uniform bound may be used for the small $|z|$. In anticipation of the results of the next section, let
us also state

Corollary 2.6. // the conditions of Theorem \2.3\ are satisfied, then for all z £ R such that \z\ ^ 1, 
(2.16) 

I P(r ^ z) - nW < z)\ P(1A1 > M) + + + ^) I ) < 1}. 
3. Berry-Esseen bounds for nonlinear functions of sums of independent random vectors

In this section, we shall use results of Section 2. Assume from hereon that $(\mathfrak{X}, \|\cdot\|)$ is a separable
Banach space of type 2; for a definition and properties of such spaces, see e.g. [13, [33|. Let
$X_1, \dots, X_n$ be independent random vectors in $\mathfrak{X}$, with $EX_i = 0$ and $E\|X_i\|^2 < \infty$ for $i = 1, \dots, n$,
and also let

$S := \sum_{i=1}^n X_i$,

$s_p := \Big(\sum_{i=1}^n \|X_i\|_p^p\Big)^{1/p} = \Big(\sum_{i=1}^n E\|X_i\|^p\Big)^{1/p}$,

$G_X(z) := \sum_{i=1}^n P(\|X_i\| > z)$,

for any $p \ge 1$ and $z \ge 0$. Under this notation, note that the assumption that $\mathfrak{X}$ is of type 2 implies the
existence of a constant $D := D(\mathfrak{X}) > 0$ such that

(3.1) $\|S\|_2 \le D s_2$.

We shall assume that $D$ is chosen to be minimal with respect to this property; note that $D = 1$
(and there is equality in (3.1)) whenever $\mathfrak{X}$ is a Hilbert space.

Remark 3.1. The results of this section hold for vector martingales taking values in a 2-smooth
separable Banach space; in such a case, one can apply results of [3J] instead of the ones of [40]
used in the present paper. By [17|, |3J], every 2-smooth Banach space is of type 2. It is known that
$L^p$ spaces are 2-smooth, and hence of type 2, for $p \ge 2$ [34, Proposition 2.1]. In particular, any
separable Hilbert space is of type 2.

Let next $f\colon \mathfrak{X} \to \mathbb{R}$ be a functional with $f(0) = 0$, satisfying the following smoothness condition:
there exist $\epsilon > 0$, $M > 0$, and nonzero $L \in \mathfrak{X}^*$ such that

(3.2) $|f(x) - L(x)| \le \frac{M}{2}\|x\|^2$ for all $x \in \mathfrak{X}$ such that $\|x\| \le \epsilon$.

Thus, the continuous linear functional $L$ necessarily coincides with the first Fréchet derivative, $f'(0)$,
of the function $f$ at 0. Moreover, for the smoothness condition (3.2) to hold, it is enough that the
second derivative $f''(x)$ exist and be bounded (in the operator norm) by $M$ over all $x \in \mathfrak{X}$ with
$\|x\| \le \epsilon$. If $\mathfrak{X}$ is a finite-dimensional Euclidean space, the latter condition means that the largest
singular value of the Hessian matrix of $f$ is bounded by $M$ over all $x \in \mathfrak{X}$ with $\|x\| \le \epsilon$.
Then we have the following uniform Berry-Esseen type bound on $f(S)$:

Theorem 3.2. Let Xi, . . . , X„ be independent random vectors in X with EX^ = and E ||Xi|p < oo 
for all i = 1, . . . ,n. // / : X — > M satisfies (|3.2p and 

(3.3) 'T:=v/V^^^^=yEJUPX^S>0, 
then for all p ^ 2 and z G K one has 

(3.4) I P(M ^ _ ^ I p(||5|| > ,) + A^;;^ + Gx (^) + r. 
where
(3.5) 

(3.6) 

(3.7) 

(3.8) 
(3.9) 



and q :~ 



\L\\ 



(^{U^ + V^){1 + Xp) + XpXgV^ 



M L 
V — 
2 e 



V := DX2 + XI l{p < 3}, 



Remark 3.3. The term Pdl^H > e) in ()3.4|) can be bounded in a variety of ways. For instance, using 
Chebyshev's inequahty and (|3.ip . 



Il^ll>e)< 



1^1 



or alternatively 



P(||5|| >£)< 



l^ll^ 



follows by a Rosenthal- type inequality (see e.g. jsil, Theorem 2] or [ssl . Corollary 1]). A better 
upper bound can be obtained based on an appropriate exponential inequality; cf. Lemma l5.3l in the 
present paper. 

Remark 3.4. Note that m < 00 whenever Sp < 00 (or hence Xp < 00), whether p^3or2^p<3, 
while X2q may be infinite for p G [2, 3) even when Sp < 00. It is the additional truncation, with A 
instead of A, in the bounds of Section [2] that allows one to use u instead of X2q. 

The main result of this section is the following non-uniform bound: 

are satisfied, then for all p > 2 and z € M. such 



Theorem 3.5. // the conditions of Theorem 
that 



(3.10) 

one has 

(3.11) 



where 
(3.12) 
and q : 



1 ^ UK 



3Cie2 



L{S) 



C z 



< r (^rL 



{D^C^slla)P 



Gx{<j/\\L\ 
\z\P 



\z\P 

g|2|/3 



l{l#Gx(^)<l}, 



r + A9(i + Ap 



P-2- 



Remark 3.6. The cause of the restriction (3.10) is the term $P(\|S\| > \epsilon)$ found in the uniform bound
(3.4), which in turn arises because the linear approximation (3.2) is assumed to hold only in an
$\epsilon$-neighborhood of the origin. Essentially, one needs $|z|\sigma = O(1)$, which in an i.i.d. setting translates
to $|z| = O(\sqrt{n})$ (cf. Corollary 3.8). Proposition 3.10 shows that this upper bound on $|z|$ is the best
possible, up to a constant factor.



The following corollary of Proposition 2.5 is to be used together with Theorems 3.2 and 3.5.

Corollary 3.7. If the conditions of Theorem 3.2 are satisfied, then for all $p \ge 2$ and $z \in \mathbb{R}$

'L{S) 



"(-^ ^ 2) - $(z)| B{z,p) = Bi{z) A B2{z,p), 
where
(3.13) 

and 
(3.14) 



Bi(z)^^E (i 



I^HI|x,| 



(- 



L\m\\ 
(ki + i) 



l|i||(f + 1) 



{\z\ + l)P 



Jz\/2 



l{(|z| + l)^'Gx( 



_£(|£| + 1)_- 

l|i||(f+l). 



<1}. 



While the expressions for the upper bounds given in Theorems 3.2 and 3.5 are quite explicit,
they may seem complicated (as compared with the classical uniform and non-uniform Berry-Esseen
bounds). However, one should realize that here there is a whole host of players: $M$, $\epsilon$, and $D$
(besides such more traditional terms as $s_p$ and $\sigma$) - each with a significant and rather circumscribed
role to play.

One way to see this is as follows. Think of the coordinates of the random vectors $X_i$ (in a given
basis) as measurements in certain units, say centimeters (cm). Suppose then that the statistic $f(S)$
has the dimensions of cm$^d$, for some $d \in \mathbb{R}$; that is, $f(S)$ is measured using a unit equal to cm$^d$;
let us write this as $f(S) \sim$ cm$^d$. (In the applications given later in this paper one will have $d = 0$,
which makes sense, as one does not want the result of a statistical test to depend on the choice
of the units of measurement.) Then $L(S) \sim$ cm$^d$, $\|L\| \sim$ cm$^{d-1}$, $\sigma \sim$ cm$^d$, $\epsilon \sim$ cm, $M \sim$ cm$^{d-2}$,
$z \sim$ cm$^0$, $C_1 \sim$ cm$^{d-2}$, $D \sim$ cm$^0$, $s_p \sim$ cm for all $p$, and so, the upper bounds in (3.4) and (3.11)
are unit-free, $\sim$ cm$^0$.

Another nice feature of these bounds is that they do not depend on the dimension of the space
$\mathfrak{X}$ of type 2 (which may even be infinite-dimensional) - but only on its "smoothness" constant $D$.

It is yet another nice feature that the bounds in (3.4) and (3.11) do not explicitly depend on
$n$. Indeed, $n$ is irrelevant when the $X_i$'s are not identically distributed (because one could e.g.
introduce any number of extra zero summands $X_i$). In fact, the bounds in (3.4) and (3.11) remain
valid when $S$ is the sum of an infinite series of independent zero-mean r.v.'s, i.e. $S = \sum_{i=1}^\infty X_i$,
provided that the series converges in an appropriate sense; see e.g. Jain and Marcus pll |.

On the other hand, for i.i.d. r.v.'s $X_i$ our bounds have the correct order of magnitude in $n$.
Indeed, let $V, V_1, \dots, V_n$ be i.i.d. random vectors in $\mathfrak{X}$, with $EV = 0$. Here we shall use

$\bar V := \frac1n \sum_{i=1}^n V_i$

in place of $S$ (and hence $\frac1n V_i$ in place of $X_i$). Then we have the following
Corollary 3.8. // ([2^ holds and 



(3.15) 

then for all z Cz M. such that 
(3.16) 



has 



(3.17) 



fiV) 



aiU/n 



L{V) 



ai := ||L(y)||2 >0, 



, , 3(7i6 , — 
1 ^ — — Vn 



^pGviVn 



<7l\Z\ 



iC,D^Vg/a,)P 



'Gy(V^ai/|jL||) ^ r. 



,,N/3/l{l-rGv(V^||M)<i}, 
where



Gv{z):^n¥{\\V\\> z) forz^Q 



An 



([u" + v^){i + Ap) + \,x,v) + (mi^y (1 + A,), 



|^||2 ■ " ■ •■P--."y ■ V 

I^IIII^IU 1 



and u, V, Ci, q are as defined in Theorems \ 3.^ and \3.5[ In particular, if \\V\\p < oo, then 



for z satisfying p.l6p . 

In the i.i.d. setting, one has a uniform bound of the form $O(1/n^{(p \wedge 3)/2 - 1})$ on the distance to
normality of the statistic $\sqrt{n} f(\bar V)/\sigma_1$ (again, the constant in the $O(\cdot)$ notation will depend on $f$
and the distribution of $V$, and also assumes $\|V\|_p < \infty$). For $p \in (2,3)$, the following proposition
provides uniform bounds of the same order, though the corresponding constant factor explodes
to infinity as $p \uparrow 3$. For $p = 3$ the bound is of order $O((\ln n)^{3/2}/\sqrt{n})$ and for $p > 3$ it is of order
$O((\ln n)/\sqrt{n})$. While these rates are suboptimal for $p \ge 3$, for moderate values of $n$ the bound given
in Proposition 3.9 may prove to be better than the uniform bound given in Corollary 3.8, since the
methods used in Proposition 3.9 are less complicated and thus may result in smaller constants.

Proposition 3.9. If (3.2) and (3.15) hold, then for all $z \in \mathbb{R}$

(3.18) $\Big|P\Big(\frac{\sqrt{n} f(\bar V)}{\sigma_1} \le z\Big) - \Phi(z)\Big| \le \begin{cases} A/n^{p/2-1} & \text{if } 2 < p < 3, \\ A(\ln n)^{3/2}/\sqrt{n} & \text{if } p = 3, \\ A(\ln n)/\sqrt{n} & \text{if } p > 3, \end{cases}$

where the constant $A$ depends on $p$, $D$, $f$, and the distribution of $V$.

The following proposition shows that the upper bound on $z$ in (3.16), and hence in (3.10), is in
general optimal up to a constant factor.

Proposition 3.10. Let $p > 2$, $\mathfrak{X} = \mathbb{R}$, and $f(x) = x + x^2$, so that $L(x) = x$. Let $V, V_1, \dots, V_n$ be
real-valued symmetric i.i.d. r.v.'s with density

$|v|^{-p-1} \ln^{-2}|v|$

for all $|v| \ge v_0$, where the real number $v_0$ and the density on $(-v_0, v_0)$ are chosen so that $v_0 > 1$,
$\|V\|_2 = 1$, and $\|V\|_p < \infty$. Then there is no sequence $(z(n))$ such that $z(n)/\sqrt{n} \to \infty$ and (3.17)
holds for all $n$ and $z = z(n)$.

In the proof of Proposition 3.10 we shall use the following proposition. While the inequalities in
(3.19) are probably well known, we shall provide a proof of Proposition 3.11 in Section 5.

Proposition 3.11. Let $(\mathfrak{X}, \|\cdot\|)$ be any (not necessarily type 2) separable Banach space. Let
$X_1, \dots, X_n$ be independent symmetric r.v.'s in $\mathfrak{X}$. Then

(3.19) $P(\|S\| > x) \ge \frac12 P(\max_i \|X_i\| > x) \ge \frac12 \cdot \frac{\sum_i P(\|X_i\| > x)}{1 + \sum_i P(\|X_i\| > x)}$

for all real $x$.
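
The chain of inequalities in (3.19) can be verified exactly in a small real-valued case. The Python sketch below is an illustration only: the function name `check_319`, the four-point symmetric law, and the choices n = 4 and x = 3 are assumptions made here, not taken from the paper. It enumerates all outcomes with exact rational arithmetic.

```python
from fractions import Fraction
from itertools import product

def check_319(dists, x):
    """Exactly evaluate the three quantities compared in (3.19) for
    independent, symmetric, real-valued discrete summands.

    `dists` is a list of laws, each a list of (value, probability) pairs.
    Returns (P(|S| > x), P(max|X_i| > x)/2, G/(2(1+G))), where G is the
    sum of the tail probabilities P(|X_i| > x).
    """
    p_sum = Fraction(0)   # P(|S| > x)
    p_max = Fraction(0)   # P(max_i |X_i| > x)
    for outcome in product(*dists):
        prob = Fraction(1)
        for _, p in outcome:
            prob *= p
        values = [v for v, _ in outcome]
        if abs(sum(values)) > x:
            p_sum += prob
        if max(abs(v) for v in values) > x:
            p_max += prob
    tails = sum((p for dist in dists for v, p in dist if abs(v) > x),
                Fraction(0))
    return p_sum, p_max / 2, tails / (2 * (1 + tails))

# Hypothetical symmetric law: +-1 with prob. 3/8 each, +-4 with prob. 1/8 each.
law = [(Fraction(1), Fraction(3, 8)), (Fraction(-1), Fraction(3, 8)),
       (Fraction(4), Fraction(1, 8)), (Fraction(-4), Fraction(1, 8))]
left, middle, right = check_319([law] * 4, x=Fraction(3))
print(left, middle, right)
```

Because the enumeration is exact, the comparison left >= middle >= right is a genuine check of (3.19) for this law rather than a Monte Carlo approximation.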

When the sum of the tails, $\sum_i P(\|X_i\| > x)$, is subexponential (as it is in Proposition 3.10), one
actually has (in contrast with the inequalities in (3.19)) the asymptotic equivalence $P(\|S\| > x) \sim
\sum_i P(\|X_i\| > x)$ for $x$ in an appropriate zone; here the symmetry of the $X_i$'s is not needed. See
|38j or [3^ and the bibliography there, or 37 1.

Remark 3.12. Note that, in applications to problems of the asymptotic relative efficiency of
statistical tests, usually it is the closeness of the distribution of the test statistic to a normal distribution
(in $\mathbb{R}$) that is needed or most convenient; in fact, as mentioned before, obtaining uniform bounds
on such closeness was our original motivation for this work.

On the other hand, there have been a number of deep results on the closeness of the distribution
of $f(S)$, not to the standard normal distribution, but to that of $f(N)$, where $N$ is a normal random
vector with the mean and covariance matching those of $S$. In particular, Götze [l5| provided an
upper bound of the order $O(1/\sqrt{n})$ on the uniform distance between the d.f.'s of the r.v.'s $f(S)$ and
$f(N)$ under comparatively mild restrictions on the smoothness of $f$; however, the bound increases
linearly with the dimension $k$ of the space $\mathfrak{X}$ (which is $\mathbb{R}^k$ therein).

On the other hand, one should note here such results as the ones obtained by Götze [lj| (uniform
bounds) and Zalesskii [43,I3| (non-uniform bounds), also on the closeness of the distribution of $f(S)$
to that of $f(N)$. There (in an i.i.d. case), $\mathfrak{X}$ can be any type 2 Banach space, but $f$ is required to
be at least thrice differentiable, with certain conditions on the derivatives. Moreover, Bentkus and
Götze [sl provide several examples showing that, in an infinite-dimensional space $\mathfrak{X}$, the existence
of the first three derivatives (and the associated smoothness conditions on such derivatives) cannot
be relaxed in general.

4. Applications 

To illustrate the use of the Berry-Esseen bounds in Section 3, we present some bounds on the
rate of convergence to normality for some commonly used statistics. For the sake of simplicity and
brevity, we consider only the special case where $p = 3$ and the r.v.'s are i.i.d., with the understanding
that the reader may apply the results of Section 3 in the general non-i.i.d. and/or $p > 2$ setting.
To this end, let us give the following corollary, which entails some loss of accuracy but is perhaps 
somewhat easier to parse than Corollary [ 



Corollary 4.1. Let $f$ satisfy (3.2). Let $\mathfrak{X}$ be a Hilbert space, and let $V, V_1, \dots, V_n$ be i.i.d. $\mathfrak{X}$-valued
r.v.'s with $EV = 0$, $\sigma_1 := \|L(V)\|_2 > 0$ and $\|V\|_3 < \infty$. Then for all $z \in \mathbb{R}$

fiV) A N ^ / . Il^ll 



(4.1) p( ^ ^) _ $(z) 



and for all z £M. satisfying p.l6p 

fiV) 



In 



A / Ai ^ A-2. 



(4.2) f(^<.)-c,(.) ^ 



;|2|/3 



where 

(4.3) := (^i^)'II^H3 + giH^lli/" 

(4.4) M:^{^-^mi ^ \\my\\i\(. , wmwi 



and Ci is defined in (|3.7I 



In what follows, $\mathbb{R}^k$ is equipped with the Euclidean norm $\|\cdot\|$, a vector $x \in \mathbb{R}^k$ is treated as a
$k \times 1$ column matrix, and a linear operator $B\colon \mathbb{R}^k \to \mathbb{R}^k$ is treated as a $k \times k$ matrix. There are
two matrix norms considered, namely the Frobenius norm

$\|B\|_F := \sqrt{\sum_{i,j=1}^k b_{ij}^2}$

and the spectral norm

$\|B\|_2 := \max_{\|x\| = 1} \|Bx\|$

for $k \times k$ matrices $B = (b_{ij})$. Note that $\|B\|_2 \le \|B\|_F \le \sqrt{k}\,\|B\|_2$ for all $k \times k$ matrices $B$ [l5
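
The relation between the Frobenius and spectral norms stated above can be checked numerically. The following dependency-free Python sketch is an illustration only: the helper names `frobenius_norm` and `spectral_norm` are hypothetical, and the spectral norm is approximated by plain power iteration, a technique chosen here for simplicity rather than one used in the paper.

```python
import math

def frobenius_norm(B):
    """Square root of the sum of squared entries of the square matrix B."""
    return math.sqrt(sum(b * b for row in B for b in row))

def spectral_norm(B, iters=500):
    """Approximate the largest singular value of B by power iteration
    on B^T B. The all-ones starting vector is assumed not to be
    orthogonal to the top right singular vector."""
    k = len(B)
    x = [1.0] * k
    for _ in range(iters):
        y = [sum(B[i][j] * x[j] for j in range(k)) for i in range(k)]  # y = B x
        x = [sum(B[i][j] * y[i] for i in range(k)) for j in range(k)]  # x = B^T y
        norm = math.sqrt(sum(v * v for v in x))
        if norm == 0.0:
            return 0.0
        x = [v / norm for v in x]
    y = [sum(B[i][j] * x[j] for j in range(k)) for i in range(k)]
    return math.sqrt(sum(v * v for v in y))  # ||B x|| for the unit vector x

B = [[1.0, 2.0], [3.0, 4.0]]
spec = spectral_norm(B)
frob = frobenius_norm(B)
print(spec, frob, math.sqrt(len(B)) * spec)
```

Since power iteration evaluates $\|Bx\|$ at a unit vector, it never overestimates the spectral norm, so the first inequality holds for the computed values as well.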
4.1. Non-central Student's T. Let $X, X_1, \dots, X_n$ be i.i.d. real-valued r.v.'s, with

$\mu := EX$ and $\sigma := \sqrt{\operatorname{Var} X} \in (0, \infty)$.

Consider the statistic commonly referred to as Student's $T$ (or simply $T$):
where



n — ' n 

i=l i=l 

let T take arbitrary values when X"^ ~ x'^ just so that T remain a statistic (i.e. measurable), and 
call T "central" when ^ = and "non-central" when /i 7^ 0. Note that S is defined here as the 
empirical standard deviation of the sample {Xi)"^-^^, rather than the sample standard deviation 

^_a_(x^_X^). ^ ^ 

Note that $T$ is invariant under the transformation $X_i \mapsto aX_i$ for arbitrary $a > 0$; so, let us
assume w.l.o.g. that

$\sigma = 1$.

Thus, if X follows a normal distribution we see that \h^^T follows Student's non-central T 
distribution with n — 1 degrees of freedom and non-centrality parameter /i. Of course, we do not 
limit ourselves to this specific case, but rather allow X to have any distribution subject to the 
moment assumptions given in Theorem 14.21 

Much work has been done rather recently concerning the distribution of the central T . Bentkus 
and Gotze [4] proved the uniform Berry-Esseen bound on the distribution of T when /i = 0: 

I P(T r^z)- $(z)| ^ AEX2 > n} + £ \x\^ is^x^ ^ 71} A 



,p/2-l 



for p € [2,3] and all z S K.; Nagaev [27j] provided explicit constants for this bound when p = 3. 
Bentkus et al. proved a uniform Berry-Esseen bound when the XiS are not necessarily i.i.d., and 
Shao [i^l provided explicit constants for this bound. See also Hall concerning the Edgeworth 
expansion of the distribution of T, Novak |3^ 31| concerning Berry-Esseen bounds on the self- 
normalized sum, Chistyakov and Gotze lllf for probabilities of moderate deviations, Shao 



41l . 14211 and Nagaev [28| for probabilities of large deviations, or Wang and Jing [46[ and Jing et 
al. |22l | for non-uniform Berry-Esseen bounds. This is of course but a sampling of the recent work 
done concerning asymptotic properties of the central T; for work done even earlier, the reader is 
referred especially to the bibliography in 4]. 

We contribute to this work by applying the results of Section [3] to T (regardless of the value of 

Theorem 4.2. Suppose that \\X\\q < 00, and also 



ai := v'E(f(X-A^)2-(X-M)-f)' > 0. 

Then for all z £ M. 

(4.6) p(^:^ <.)-$(.) ^Au + M 

V (7i / y/n V y/n 

and for all z ^ M. satisfying p.l6p with (say) ^ — \ 

(4.7) I p(Z_^^, )_$(,) 



A f Al A2 



\z\/3 



where Ai and A2 are as defined in (j4.3p and (j4.4p (again with e — M < 00 is a constant 
dependent only on /i, ||L|| = ^^4 + /i^, and 



\\v\u = \\{x- -{X- + 11117, ^ 11^ - -"112" + 11^ - -"11" + 1 



/or a e {2,3}. 



Remark 4.3. Note that if = then cti 7^ 0, and otherwise cti = if and only if X has a 2-point 
distribution. Particularly, 

l^tv ..^2 ,v ,.^ M_n„„ ^ V ,, _ 1 



fii-O ^ ^{X-ixf -{X-ti)-L- = Qa.s. ^ X-/i = -(1± v/TT7? 
That is, $\sigma_1 = 0$ if and only if



X = 



2v/p(l-p) 
1 - 2p 



Bp a.s., 



where Bp is a standardized Bernoulli(p) r.v. with p e (0, 1) \ {\}. 

Bentkus et al. recently showed that if $\|X\|_4 < \infty$, then (after some standardization) $T$ has
a limit distribution which is either the standard normal distribution or the $\chi^2$ distribution with
one degree of freedom; the latter will be the case if and only if $X$ has the two-point distribution
described above.

Bounds (4.6) and (4.7) appear to be new for the non-central $T$. Bentkus et al. [5] provide a
sufficient condition for $(T - \sqrt{n}\mu)/\sigma_1$ to converge in distribution to a standard normal r.v.; namely,
that $\|X\|_4 < \infty$ and $\sigma_1 \ne 0$ (see the previous Remark 4.3 concerning the degeneracy condition $\sigma_1 =
0$). Note that the condition $\|X\|_4 < \infty$ is equivalent to $\|V\|_2 < \infty$, where $V = (X - \mu, X^2 - 1 - \mu^2)$,
which is what we use to derive Theorem 4.2 from Corollary 4.1. Therefore, it seems rather
natural to require that $\|V\|_3 < \infty$ or, equivalently, $\|X\|_6 < \infty$ to obtain a bound of order $O(1/\sqrt{n})$;
cf. the classical Berry-Esseen bound for linear statistics, where the finiteness of the third moment
of the summand r.v.'s is usually imposed to achieve a bound of order $O(1/\sqrt{n})$.



4.2. Pearson's R. Let $(X, Y), (X_1, Y_1), \dots, (X_n, Y_n)$ be a sequence of i.i.d. random points in $\mathbb{R}^2$,
with $\mathsf{E}(X^4 + Y^4) < \infty$, $\operatorname{Var} X > 0$, and $\operatorname{Var} Y > 0$. Recall the definition of Pearson's
product-moment correlation coefficient:



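\[ R := \frac{\sum_{i=1}^n (X_i - \bar X)(Y_i - \bar Y)}{\sqrt{\sum_{i=1}^n (X_i - \bar X)^2}\,\sqrt{\sum_{i=1}^n (Y_i - \bar Y)^2}} = \frac{\overline{XY} - \bar X\,\bar Y}{\sqrt{\overline{X^2} - \bar X^2}\,\sqrt{\overline{Y^2} - \bar Y^2}}, \tag{4.8} \]

where

\[ \bar X := \frac1n\sum_{i=1}^n X_i, \quad \bar Y := \frac1n\sum_{i=1}^n Y_i, \quad \overline{X^2} := \frac1n\sum_{i=1}^n X_i^2, \quad \overline{Y^2} := \frac1n\sum_{i=1}^n Y_i^2, \quad \overline{XY} := \frac1n\sum_{i=1}^n X_iY_i; \]

let us allow R to take arbitrary values if the denominator in (4.8) is 0, as long as R remains a
statistic. Note that R is invariant under all linear transformations of the form $X_i \mapsto a + bX_i$ and
$Y_i \mapsto c + dY_i$ with positive b and d, so in what follows we may (and shall) assume w.l.o.g. that the
r.v.'s X and Y are standardized:

\[ \mathsf{E}X = \mathsf{E}Y = 0 \quad\text{and}\quad \mathsf{E}X^2 = \mathsf{E}Y^2 = 1. \]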
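
The two expressions for R in (4.8) and the invariance just noted are easy to confirm on simulated data. The following is a minimal sketch; the function names, the sample size, the seed, and the constants a, b, c, d are illustrative assumptions and not part of the paper.

```python
import math
import random

def pearson_r_centered(xs, ys):
    """First form in (4.8): centered sums of products."""
    n = len(xs)
    xbar, ybar = sum(xs) / n, sum(ys) / n
    num = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    den = math.sqrt(sum((x - xbar) ** 2 for x in xs)) * math.sqrt(sum((y - ybar) ** 2 for y in ys))
    return num / den

def pearson_r_moments(xs, ys):
    """Second form in (4.8): a function of the five sample means."""
    n = len(xs)
    xbar, ybar = sum(xs) / n, sum(ys) / n
    x2bar = sum(x * x for x in xs) / n
    y2bar = sum(y * y for y in ys) / n
    xybar = sum(x * y for x, y in zip(xs, ys)) / n
    return (xybar - xbar * ybar) / (math.sqrt(x2bar - xbar ** 2) * math.sqrt(y2bar - ybar ** 2))

rng = random.Random(0)
xs = [rng.gauss(0.0, 1.0) for _ in range(200)]
ys = [0.6 * x + 0.8 * rng.gauss(0.0, 1.0) for x in xs]

r = pearson_r_centered(xs, ys)
r_moments = pearson_r_moments(xs, ys)
# Affine maps with positive slopes (b = 2, d = 0.5) should leave R unchanged.
r_affine = pearson_r_centered([3.0 + 2.0 * x for x in xs], [-1.0 + 0.5 * y for y in ys])
print(r, r_moments, r_affine)
```

The second form is the one used in the proofs: it exhibits R as a smooth function of the sample means of $(X, Y, X^2, Y^2, XY)$.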

We then have the following non-uniform bound on the rate of convergence of the statistic R to 
normality: 



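Theorem 4.4. Let $\|X\|_6 + \|Y\|_6 < \infty$, and also

\[ \rho := \mathsf{E}\,\frac{X - \mathsf{E}X}{\sqrt{\operatorname{Var} X}}\,\frac{Y - \mathsf{E}Y}{\sqrt{\operatorname{Var} Y}} = \mathsf{E}XY, \qquad \sigma_1 := \sqrt{\mathsf{E}\bigl(XY - \tfrac{\rho}{2}(X^2 + Y^2)\bigr)^2} > 0. \]

Then, for all $z \in \mathbb{R}$,

(4.9)

R- p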



and, for all $z \in \mathbb{R}$ satisfying (3.16) (with $\varepsilon = \frac12$),

R- p 



\ n V 



(4.10) 



\v\\l 



,|2|/3 
where $A_1$ and $A_2$ are defined in (4.3) and (4.4) (again with $\varepsilon = \frac12$), $\mathfrak{A} < \infty$ is a constant dependent
only on $\rho$, $\|L\| = \sqrt{1 + \rho^2/2}$, and

\[ \|V\|_\alpha = \bigl\|\bigl(X^2 + Y^2 + (X^2 - 1)^2 + (Y^2 - 1)^2 + (XY - \rho)^2\bigr)^{1/2}\bigr\|_\alpha
\le \|X\|_\alpha + \|Y\|_\alpha + \|X^2 - 1\|_\alpha + \|Y^2 - 1\|_\alpha + \|XY - \rho\|_\alpha \]

for $\alpha \in \{2,3\}$.

Remark 4.5. Note that the degeneracy condition $\sigma_1 = 0$ is equivalent to the following: there exists
some $\kappa \in \mathbb{R}$ such that the random point (X, Y) lies a.s. on the union of the two straight lines
through the origin with slopes $\kappa$ and $1/\kappa$ (for $\kappa = 0$, these two lines should be understood as the
two coordinate axes in the plane $\mathbb{R}^2$). Indeed, if $\sigma_1 = 0$, then $XY - \frac{\rho}{2}(X^2 + Y^2) = 0$ a.s.; solving
this equation for the slope Y/X, one obtains two roots, whose product is 1. Vice versa, if (X, Y) lies
a.s. on the union of the two lines through the origin with slopes $\kappa$ and $1/\kappa$, then $XY = \frac{r}{2}(X^2 + Y^2)$
a.s. for $r := 2\kappa/(\kappa^2 + 1)$ and, moreover, $r = r\,\mathsf{E}\,\frac12(X^2 + Y^2) = \mathsf{E}XY = \rho$.

For example, let the random point (X, Y) equal $(cx, \kappa cx)$, $(-cx, -\kappa cx)$, $(\kappa cy, cy)$, $(-\kappa cy, -cy)$
with probabilities $\frac{p}{2}$, $\frac{p}{2}$, $\frac{q}{2}$, $\frac{q}{2}$, respectively, where $x \ne 0$, $y \ne 0$, $\kappa \in \mathbb{R}$,

\[ c := \frac{\sqrt{x^2 + y^2}}{|xy|\sqrt{1 + \kappa^2}}, \qquad p := \frac{y^2}{x^2 + y^2}, \qquad q := 1 - p; \]

then $\sigma_1 = 0$ (and the r.v.'s X and Y are standardized). In particular,
one can take here $x = y = 1$, so that $p = q = \frac12$.
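
The four-point example can be verified by direct computation of moments. In the sketch below the function names are ours, and the scale c is an assumption of the sketch: it is taken as $c = \sqrt{(x^2 + y^2)/((1 + \kappa^2)x^2y^2)}$, the value forced by the requirement $\mathsf{E}X^2 = \mathsf{E}Y^2 = 1$. The code computes $\mathsf{E}X$, $\mathsf{E}Y$, $\mathsf{E}X^2$, $\mathsf{E}Y^2$, $\rho = \mathsf{E}XY$ and $\sigma_1^2$ exactly as finite sums.

```python
import math

def four_point_law(x, y, kappa):
    """Atoms ((X, Y), probability) of the four-point example in Remark 4.5."""
    p = y * y / (x * x + y * y)
    q = 1.0 - p
    # Assumed scale: chosen so that E X^2 = E Y^2 = 1.
    c = math.sqrt((x * x + y * y) / ((1.0 + kappa * kappa) * x * x * y * y))
    return [
        ((c * x, kappa * c * x), p / 2),
        ((-c * x, -kappa * c * x), p / 2),
        ((kappa * c * y, c * y), q / 2),
        ((-kappa * c * y, -c * y), q / 2),
    ]

def summary(law):
    """Return (E X, E Y, E X^2, E Y^2, rho, sigma_1^2) for a finite law."""
    ex = sum(w * a for (a, b), w in law)
    ey = sum(w * b for (a, b), w in law)
    ex2 = sum(w * a * a for (a, b), w in law)
    ey2 = sum(w * b * b for (a, b), w in law)
    rho = sum(w * a * b for (a, b), w in law)
    sigma1_sq = sum(w * (a * b - 0.5 * rho * (a * a + b * b)) ** 2 for (a, b), w in law)
    return ex, ey, ex2, ey2, rho, sigma1_sq

for x, y, kappa in ((1.0, 1.0, 0.5), (2.0, 3.0, -1.5), (1.0, 2.0, 0.0)):
    print((x, y, kappa), summary(four_point_law(x, y, kappa)))
```

In each case the law is standardized, $\rho$ equals $2\kappa/(\kappa^2 + 1)$, and $\sigma_1^2$ vanishes up to rounding.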

The bounds in (4.9) and (4.10) appear to be new. In fact, we have not been able to find
in the literature any uniform (or non-uniform) bound on the closeness of the distribution of R
to normality. Note that such bounds are important in considerations of the asymptotic relative
efficiency of statistical tests; see e.g. Noether [29]. Shen [44] recently provided results concerning
probabilities of large deviations for R in the special case when (X, Y) is a bivariate normal r.v.
Formal asymptotic expansions for the density of R follow from the paper by Kollo and Ruul [23].

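4.3. Non-central Hotelling's $T^2$ statistic. Let $k \ge 2$ be a positive integer, and let $X, X_1, \dots, X_n$
be i.i.d. r.v.'s in $\mathbb{R}^k$, with $\mathsf{E}\|X\|^4 < \infty$,

\[ \mu := \mathsf{E}X, \quad\text{and}\quad \Sigma := \operatorname{Cov} X = \mathsf{E}XX^T - \mu\mu^T \ \text{positive definite}. \]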

Consider Hotelling's $T^2$ statistic

\[ T^2 := \bar X^T (S^2/n)^{-1} \bar X = n\,\bar X^T \bigl(\overline{XX^T} - \bar X \bar X^T\bigr)^{-1} \bar X, \tag{4.11} \]

where

\[ \bar X := \frac1n\sum_{i=1}^n X_i, \quad \overline{XX^T} := \frac1n\sum_{i=1}^n X_iX_i^T, \quad S^2 := \frac1n\sum_{i=1}^n (X_i - \bar X)(X_i - \bar X)^T = \overline{XX^T} - \bar X\bar X^T; \]

note that the generalized inverse is often used in place of the inverse in (4.11), though here we
may allow $T^2$ to take any value whenever $S^2$ is singular, as long as $T^2$ remains a statistic. Note
that $S^2$ is defined as the empirical covariance matrix of the sample $(X_i)_{i=1}^n$, rather than the sample
covariance matrix $\frac{n}{n-1}S^2$. Call $T^2$ "central" when $\mu = 0$ and "non-central" otherwise.

For any nonsingular matrix B, $T^2$ is invariant under the invertible transformation $X_i \mapsto BX_i$;
particularly, letting $B = \Sigma^{-1/2}$ allows us to assume w.l.o.g. that

\[ \operatorname{Cov} X = I, \]

the $k \times k$ identity matrix.

The form of $T^2$ in (4.11) is easily seen to yield to a Berry-Esseen type bound via Corollary 4.1,
$T^2$ being a function of the two sums of random vectors $\bar X$ and $\overline{XX^T}$.

Theorem 4.6. Assume that $\|X\|_6 < \infty$ and

\[ \sigma_1 := \sqrt{\mathsf{E}\bigl(((X-\mu)^T\mu)^2 - 2(X-\mu)^T\mu - \mu^T\mu\bigr)^2} > 0. \]

Then for all $z \in \mathbb{R}$

(4.12) f(^:^^^z)-h.) + m 

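and for $z \in \mathbb{R}$ satisfying (3.16) (with $\varepsilon = \frac12$, say)

where $A_1$ and $A_2$ are defined in (4.3) and (4.4) (again, with $\varepsilon = \frac12$), $\mathfrak{A} < \infty$ is a constant dependent
only on $\|L\| = \|\mu\|\sqrt{4 + \|\mu\|^2}$, and

\[ \|V\|_\alpha = \bigl\|\bigl(\|X-\mu\|^4 - \|X-\mu\|^2 + k\bigr)^{1/2}\bigr\|_\alpha \le \|X-\mu\|_{2\alpha}^2 + \|X-\mu\|_\alpha + \sqrt{k} \]

for $\alpha \in \{2,3\}$.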

Remark 4.7. The non-degeneracy condition $\sigma_1 > 0$ immediately implies that $\mu \ne 0$, so that
Theorem 4.6 is applicable only to the non-central $T^2$. If $\mu \ne 0$, then $\sigma_1 = 0$ if and only if
$(X-\mu)^T\mu = 1 \pm \sqrt{1 + \|\mu\|^2}$ a.s., that is, if and only if $\mathsf{P}(X^T\mu = x_1) = 1 - \mathsf{P}(X^T\mu = x_2) = p$, where

in other words, $\sigma_1 = 0$ if and only if X lies a.s. in the two hyperplanes defined by $X^T\mu = x_1$ or
$X^T\mu = x_2$. Note the similarity to the degeneracy condition of T described in Remark 4.3. Recalling
the conditions $\mathsf{E}X = \mu$ and $\operatorname{Cov} X = I$, we have $\sigma_1 = 0$ if and only if

\[ X = \xi\,\frac{\mu}{\|\mu\|} + \tilde X \quad \text{a.s.}, \]

where

\[ \xi = \frac{2\sqrt{p(1-p)}}{1-2p} + B_p \quad \text{for some } p \in (0, \tfrac12), \]

and $\tilde X$ is a random vector in $\mathbb{R}^k$ such that $\mathsf{E}\tilde X = 0$, $\mathsf{E}\,\xi\tilde X = 0$, $\tilde X^T\mu = 0$ a.s., and $\operatorname{Cov}\tilde X$ is the
orthoprojector onto the hyperplane $\{\mu\}^\perp := \{x \in \mathbb{R}^k : x^T\mu = 0\}$.

Again, the bounds in (4.12) and (4.13) appear to be new; indeed, we have found no mention of
Berry-Esseen bounds for $T^2$ in the literature. Probabilities of moderate and large deviations for the
central Hotelling $T^2$ statistic (when $\mu = 0$) are considered by Dembo and Shao [T^. Asymptotic
expansions for the generalized $T^2$ distribution for normal populations were given by Ito [101 (for
$\mu = 0$), and by Ito }^^, Siotani and Muirhead [2^ (for any $\mu$).

5. Proofs 

Proofs of all theorems, propositions and corollaries stated in the previous sections are provided 
here. 

5.1. Proofs of results from Section 2.

Proof of Theorem 2.1. As noted in Remark 2.2, the assertions of Theorem 2.1 are very similar to
those of [9, Theorem 2.1]. From the condition that $|\Delta| \ge |T - W|$ (cf. [9, (5.1)]),

(5.1) $-\mathsf{P}(z - |\Delta| \le W \le z) \le \mathsf{P}(T \le z) - \mathsf{P}(W \le z) \le \mathsf{P}(z \le W \le z + |\Delta|)$

for arbitrary z S M. Replace every instance of A in the proof of Theorem 2.1] (from (5.2)] 
and thereafter) with A; this action proves that 

n 

P(z ^ < z I A|) 4(5 E |iyA| ^ E |^,(A - A^)]. 



Recalling the condition (|2.7[) on A, one has 

P(z s; z -f |A|) P(z < z -I- |A|) -I- P(max|r;,| > l) 
Then $\mathsf{P}(z - |\Delta| \le W \le z)$ is bounded in a similar fashion, using $z - |\Delta|$ in place of z. Inequality (2.5)
then follows from (5.1); inequalities (2.6) and (2.8) follow from (2.5) and the first few arguments in
the proof of [9, Theorem 2.1]. □

Proof of Theorem 2.3. The proof of Theorem 2.3 largely follows the lines of that of [9, Theorem 2.2].
The extension to p other than 2 is obtained using a Cramér-tilt absolutely continuous transformation
of measure along with Rosenthal's inequality. Assume w.l.o.g. that $z \ge 0$, and let

n 

(5.2) ^ 1} and W ■.^Y.'^,- 

1=1 

Recalling that ^ \rii\ a.s. and also the condition (|2.7p . one has 

P(z- |A| i^W^z) P(|A| > ^) +P(z- |A| < z, |A| ^ ^) 

(5.3) < P(|A| > ^) + V{z -\A\^W^z, |A| < ^,max|77,| > l) 



(z-|AKM/^z,|AK^), 



and further 
(5.4) 



|AK W^^z,|A| < 2±i,max|7y,| > l) ^J^H^ ^ > l) 

n n 



i=l i=l 

Next, replace every instance of A in the statement and proof of 0, Lemma 5.2] as well as 
(2.8)] with A. After making this replacement, there are two inequalities which need modification. 
First, fi', (5.21)] is modified to the following: 



I? 



X:E|e,e(^-«.)/2(A - A,)| E 11^^^^"^-''^% IP - 

n 

(5.5) =Y.^'/'>ei(^-~^-^\M,\\A-A,\\, 

n 

^exp{i(e''/2-l-f)}5]liedlp||A-A,l|,; 

i=l 

the last step above comes from [9, (5.15)]. 
The final change is to [9, (5.22)]: 

(5.6) E\W\e^/^{\A\+2S) (1| AJI, + 25) HM^e^/^H^; 

Chen and Shao [§] were able to bound KW^e^ (corresponding to the case p = 2) with an absolute 



constant; here, more work is required to bound the last term in ()5.6p for the general p. Specifically, 
we apply Cramer's tilt to the ^i's. 

For any c > 0, let ^ := (^i, . . . ,^„), | := (Ji, . . . ,C,i), and let ^ =: (|i, ...,!„) be a random 
vector such that 

P £ € A) = ^4= 

for all Borel sets A E M"; note that the ^i's are independent r.v.'s. Further, if /: K" ^ M is any 
nonnegative Borel function, then 

(5.7) E/(0^ . 
Next,

Ee, = m 1} = - EC, m^ >i}^-E^f^ -i, 

so that Jensen's inequality yields 
and further 

with J2i^& ^ 6^^^ ^ consequence of this. Also, je'^^' — 1| ^ cl^Je'^ ^ c|^i|e'^, so that 
lE^.e'^^-l = I E^,(e=«' - l) | s: E|C.||e^«' - l| ce^E^' 



and 





EC.e^«> 




Ee^«. 



^ce^^E^f, 



implying E^i| ^ ce . Let now c 2 Rosenthal's inequality (see e.g. [34|, Theorem 5.2]) yields 



(5.8) ( ( m - E r) + ( E(|. - E if) + f 



Letting f{xi, . . . ,x„) = | J2i^i\^ ™ (|5.7p and using [9, (5.15)] again, one has 

(5.9) _ 

II ^e^^llp = ( E I I = ( E E I E. r) < exp { i (e^/^ - 1 - f ) } II 1. 1 
Hence, combining (|5.6p . (|5.9p and (|5.8p . we have 

(5.10) E|T4^|e^/2(|A|+25) {\\A\\, + S){1 + ap). 

Replacing inequaHties 0, (5.21) and (5.22)] with (|5.5p and (|5.10p . respectively, shows that 

(5.11) F{z - I AK "W^ z, I A| < ^) r e'^/^, 

with T as defined in (|2T^ (cf. 0, (5.14) and (2.8)]). 

Combining (|5.3p . (|5.4p . and (|5.1ip and recalling the definition ()2.1ip of 7^ yields 

P{z-\A\^W ^z) <p7.+Te^"/'; 

in a similar fahsion, one shows P(z ^ ^ z + |A|) 7^ +r e~^/'^. Referring back to (|5.ip finishes 
the proof. □ 



We shall prove Proposition 2.5 by using a result of [8], based on an appropriate modification of
Stein's method, along with the following corollary of an exponential inequality due to Hoeffding:
for all z and t > 0,

/i /t 

(5.12) p(p^^,)^^p(^,>t) + (_^y ^G^{t)+[^y ; 

i—l 

this can be easily obtained by truncation from e.g. [sj, Theorem 8.2] (recall that (72 = 1). 

Proof of Proposition [275[ Assume w.l.o.g. that z ^ 0, and suppose first that (z + l)PG^(l^) > 1. 
Then, by (jST^ with < := fyj, 



> ^) ^ (iv^) + ( i + ^z(z+i) ) ^ (fn) + ( (7w) 

(5.13) p V ^ 2 V ; 



p 

z + l\ 1 ^/z + 1 



^P Q j + ^p G,yj 
One also has

which hiiplies \ F{W ^ z) - <^{z)\ <p -82(2, p), and so ([213]) holds. 

It remains to consider the case when (z + l)^G^(-fq-j) < 1. As in the proof of fs', Theorem 6.4], 
write 

< F{W > z)~ P{W > z) ^ P{W > z, max^, > 1), 

i 

with W as defined in (|5.2p . Next, for z ^ |, one has z — -^j^^ ^ 0, and 



>z,maxe. > 1) ^^P(6 > +^p(w^-e. >^-|^)P(6 > 1) 



2 ' \ 2 



-{z + iy 



p^nf + iy ^ (z + i)p' 

with the second inequality following by (15.12p with t := 4xT (and reasoning similar to (|5.13p ) and 
the third inequality by the case condition and the assumption that ^ |7yj| a.s. for i = 1, . . . ,n. 
If z < |, then ¥{W - > z - s^p (z + 1)"^ holds trivially and one still has ¥{W > 

(^+1) 

\F{W z) - $(z)| < A 



z, maxi S,i > 1) G,-i{-w^) + (^fi^- It remains to refer to inequality (6.15) in Chen and Shao 



_\z\/2- 

□ 

Proof of Corollarv \2.(A The proof is quite similar to that of Proposition 12.51 Assume w.l.o.g. that 
z ^ 1, and consider first the case when zPG^(|^) ^ 1. Then 

|P(T > z)-¥{W > z)\ < P(T > z) + P(T4^ > z) ^ P(|A| > f) +f{W > + Y{W > z). 

Recalling that ^ \rii\ a.s., so that ^ G^, use the case condition to obtain (similarly to (|5.13p 
and choosing < := || in (|5.12p ). 

Y{W >z)^ V{W > f ) Ge(lf) + G,(|), 

which proves the assertion in this case. 

In the alternative case, when z^Grji^) < 1, we use Theorem 12.31 The terms of 7^ as defined in 
(|2.1ip are bounded below: 

P(|A| > £±i) s^P(|A| > f), 
G,(^)^Ge(f)^G,(|)^G,(|), 

G,(l). 



5^P(|M^-^.| > ^)P(|77,| >i) s;p 



zP 



the second line above uses the fact that p ^ 2, and the third line is trivial if 1 ^ z < 2 and otherwise 
follows similarly to (|5.14p (using t := |^ in (|5.12p and the case condition). Recalling (|2.10p yields 
the result (|2.16p for this final case. □ 

5.2. Proofs of results from Section 3. The uniform and non-uniform Berry-Esseen type bounds
of Theorems 3.2 and 3.5 rely on the corresponding bounds of Section 2. Let f be a function satisfying
(3.2), and also let $X_1, \dots, X_n$ be independent, zero-mean $\mathfrak{X}$-valued random vectors. For $i = 1, \dots, n$,
let

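\[ g_i(x) := \frac{L(x)}{\sigma} \quad\text{and}\quad h_i(x) := \frac{\|L\|\,\|x\|}{\sigma} \]

for $x \in \mathfrak{X}$, so that, in accordance with (2.1),

\[ \xi_i = g_i(X_i) = \frac{L(X_i)}{\sigma} \quad\text{and}\quad \eta_i = h_i(X_i) = \frac{\|L\|\,\|X_i\|}{\sigma}, \]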

where a = ||i(S')||2 is as defined in (\3.'S\i . RecaUing the definitions (|2.9p and p.5p . it is easy to see 
that 

(5.15) a„ ^ {E^mrf'' ^ ^ {E^nxrf'' = a,,. 

Also, by Chebyshev's inequality, 

(5.16) Q(.) ^ G,(.) = G.(^) ^ (Mi^)" ^ (^)" 
for arbitrary a ^ 1 and z > 0. Next, let 

a a 

i 

and also 

(5.17) f := r I{||S'|| ^ e}. 
Finally, let 

(5.18) A:=— 

(T 



Then, by JS^) and (i377|l . 

|f - M/| = a-'(\f{S) - L{S)\ 1{\\S\\ < 6} + \L{S)\ 1{\\S\\ > 

<.-(fvM)||5f.A. 
Adopt some more notation: 

(5.19) X,:^ Xa{\m\^^] and 5:=^1„ 

and then let 

(5.20) A ^ (\\Sfl{p ^ 3} + \\S\\H{p < 3} 
and 



A, := —(\\S~X,fl{p ^3} + \\S~ X4H{p < 3} 

(7 \ 



With all of this notation in mind, note that the assumptions of Theorems 12.11 and 12.31 are satisfied 
for the nonlinear statistic T (in place of T) and its linear approximation W; particularly, E^^ = 0, 
VarH^ = 1, |A| > \ f - W\, ^ryi,A satisfies (|2J)) and A^ satisifes the condition that Xi and 
(Ai, {Xj : j ^ i)) are independent (which further implies Xi and (A^, W — ^i) are independent). 



Lemma 5.1. If the conditions of Theorem \3.S\ are satisfied, then 



Lemma 5.2. If the conditions of Theorem \3.2\ are satisfied, then 

n n ^ , 

|e.(A ~ A,)| < 5]||e.||p|| A - A.ll, ^ \{u^ + v\). 

i=l i=l " ' 



Lemma 5.3. If the conditions of Theorem \3. 51 are satisfied, then 
(5.21) Pdl^ll >e)^ F{\\S\\ >x)^ Gx[^) + j^, 
where



(5.22) x:=\ -^ and Ai := 12epCi 

y 36i a 

The proofs of these lemmata are deferred to the end of this subsection. 

Proof of Theorem \3.2\ Take any z € R. In view of (I5.17P and applying (|2.6p from Theorem 12.11 
(5.23) 

I P(r ^z)- f{W s$ z)| PdlS"!! > e) + I P(f s$ z) - f{W s$ z)\ 

n 

(I051 a) < PdlS-ll > e) + 2/3 + E|W^A| +^E|^,(A- A,)| +P(max|77,| > l). 

i=l ' 

Recalling (|2.4p and using the inequality A \x\^ ^ jxl^^"^ for a; £ M and p ^ 2, and further using 
(|5.15p . one has 

(5.24) a < ^ A^;^^. 

Also, 

(5.25) P(max|77,| > l) < G^(l) = Gx(^). 

Using Rosenthal's inequality (see, e.g. [13, Theorem 5.2]) and recalling that (t| — VarM^ — 1, 
and so, 

(5.26) E \WA\ ^ \\WU-K\U <P ^ + + Ap) 

by Lemma [5. II It remains to refer to (|5.23lap - (|5.26p and Lemma [5T^ □ 



Proof of Theorem\EIE Assume w.l.o.g. that z ^ 1. By and ([^1^ . 

(5.27) 

I P(T ^ z)-P(H^ z)| P(|15|l > .)+P(|A| > f ) +G,(|) + (^ + -1^) I {z^'G.d) < l}. 

The definition (|5T8l) of A implies P(|A| > f ) = ^{\\S\\ > x), where x is as in (jO^ . Lemma[53] 
then implies 

,5.8, niisii > + P(|A| > 5) (I;) + (i^^ . (^) . ffi!^i£l/^£, 

because ^ ^5 63^777 follows by the condition (|3.10p on z. Note that ||L|| ^ Cie by (|3.7p . whence 
(5-29) gJ^)=Gx(-^)<g4/^ ^ 



also, 

(5.30) G,(l) = Gx(^). 

By [i, Remark 2.1] and ((5T5)) . 

" '^p ~ <^p ^ ^pi 

hence, recalling the definition (|2.12p of r and also the definition (|3.12p (see also p.6p ) of Fi, 
(5.31) 

r=(||A||, + 5)(l + ap) + ^||^,||p||A-A,||,^p l^((^.2+t;2)(i + Ap) + ApA,z;)+A|(l + Ap) = ri 

z III 

follows from Lemmas ISH and [5?2l and also (|5T5l) . Collecting (|5?28l) - (|OT|) into (|5?27|) finishes the 
proof. □ 
Proof of Corollary\K]\ Recalling the definition H^l^ ofBi{z), (|XT^ follows since = \L{Xi)\/a
|lL|||lXi||/CT for i = 1, . . . ,n. Recalling the definition ((2J5| of -82(2, p), ((3ll)) wiU follow by (jSTS)) 
and (jg^fel) . □ 

Proof of CoroUary\J^ Let X, :== ^V,, so that 5 = J2'Li = V- Then 



2_^..T7^2_"l „ _ ||i||||^.|| „ _ ni 



n ' \/na\ ' \/na\ ^Z" ' 

and 

GxH = nP{\\V\\ > nw) = Gv{nw) 
for any a ^ 1 and w > 0. The resuh (l3J6l) - ([3l7ll now follows from (l3T^ - (l3lI|) . □ 

Proof of ProvosiUon\3.9l Assume w.l.o.g. that z ^ 0. Let A := /(F) - L(V) and 6 > he 
some quantity dependent on n (and perhaps other constants associated with the distribution V), 
unspecified for the moment. Then (cf. (j5.ip ) 



Vo-i/Vn / Jn ) 



A\\L(V\\\^'^'^ 

(5^32) ^'^'' + -'' + „,,L-«™:f" +''<^l^l^''^) 

where the second inequality follows from the classical Berry-Esseen bound (or using (j2.14p ) Sim- 
ilarly one bounds P( ^'^J^ ^ z) from below. So, it remains to bound P(y^|A| > (Ti(5), for an 



appropriate choice of 8. Let S :— Then, by 

(5.33) P(V^|A| >ai5) ^P{\\V\\ > e) +¥(^V^ f\\Vf > aiS) +P(|1^|1 >x), 
where 

Next take any ?; > and let Vi^y := Vi I{|| < y} and Sy :— J2^=i ^i,v Note that 

(5.34) E||5„|| \\Sy-ESyh + \\ESy\\ i^2DV^\\Vh+'-^ ^ f; 

the last inequality here will hold by an appropriate choice of S and y (or , equivalently, x and y), 
to be made at the end of this proof. Using the exponential inequality j40l . Corollary 2] along with 
Chebyshev's inequality, one has 

F{\\S\\ > x)^ P(max||F,|| > y) + P(||5,|| >x)^\\V\\p^+ F{\\Sy\\ - E||5,|| > f ) 



y ^ xy 

Combining (|5.32p . (|5.33p and (|5.35p . and solving for S in terms of x, one obtains 



ne^ ^ yP V xy 



If p 3, let y :— e\\V\\2\/n/ Inn and choose S so that x = 2e\\V\\2\/nlnn. If p G (2,3), take 
X = 2e\\V\\2n'^^~P'>^'^ and y = e\\V\\2^/n- Then for large enough n one has the last inequality of 
(|5.34p . as well as p.lSp : if n is not large, then (|3.18p is trivial. □ 

If $f = f(n) > 0$ and $g = g(n) > 0$ are sequences of real numbers, let us use the following standard
definitions:

\[ f = o(g) \iff \lim_{n\to\infty} \frac{f(n)}{g(n)} = 0 \]

and

\[ f \asymp g \iff \limsup_{n\to\infty} \Bigl(\frac{f(n)}{g(n)} + \frac{g(n)}{f(n)}\Bigr) < \infty. \]

Proof of Proposition\ME Let S -.^V, T f{S)/a = ^{S + S^), W := L{S)/a = y^S and 
A :— T ~ W — \/nS'^ (note that Ci may be taken to be 1 by choosing e ^ 1, so that (|5.18p holds). 
Throughout this proof, let C stand for various positive constants which do not depend on n. 
To obtain a contradiction, assume that (|3.17p holds for some sequence z — z{n) ^ 1 such that 

(5.36) K:—z/^/n^oo; 
further let 

(5.37) w n3/4zi/2 ^ ^1/2^^ 

SO that w/n — k^^'^ —> oo. Note that, for v > vq^ the tail probabihty of V is 

(5.38) nV>v)^ / ^^^^^^ 

Jv u^^ in u ti'' In w 

which follows by l'Hôpital's rule. Consider the various terms on the right-hand side of inequality
(3.17). Now (5.36)-(5.38) imply

(5.39) _^l^^<:^^^=o{nw-P-') = o{nnV>w)), 
(5.40) 

(CiD2||F||?/CTi)f Cn ( n \ ( n \ ( n \ , 

nP/^\z\P wPukp/^ \wPln{nKP^^)J \wPln{nK)^ \wP\nwJ ^ ' 

(5.41) GviV^^^)- = X = oinnV > w)) 

"^V GpCieJ {^z)P\n^(y/^z) wPKP/^\n^{nK) kp/^wpIu^w ^ ' 

and 
(5.42) 

Gv{\/nai/\\L\\) n n 



o{nV(y > w)). 



\z\P [^/nz)P\n^n vjPnP/'^hi'n ^wP In' nln'n^ ^wPln^(nK) 

Collecting the terms (5.39)-(5.42) of (3.17) shows

(5.43) |P(r s$ z) -P(VK s$ z)\ = o(nP(y > w)) . 
Next, if p ^ 3 then ||y||3 < 00, and (refer back to ^J^) 

(5.44) B,{z,p) ^ Gv^(V^ ||^||(H + i) J + (|z| + l)P + V^eN/^ = "^^^^^ > 

by jSZIl), (IPS]) and ([QQ]) . Otherwise, if p e (2,3) then \\V\\p < 00, and (see ([XTS]) ) 

1 /-^^ , 1 /-"^ v^-P , 

i3i(z) X / — 5— au H — ^ / — 5— aw. 

Consider the first term above: 

1 /-^^ v^~P , 1 (JTizf-P n n , , ^^ 



\n^v yfnz^ In^(^z) nP/^zP In^(nz) wPkp/2 In^(nz) 

Similarly, 

-1.1 -^dv^^^^^-^ = — 2 ^o{nV{V>w)) 

z J^z In V In (nz) nP"zP In (nz) 

so that 

(5.45) Bi{z) ^ o{nP{V > w)). 
Hence and ([^1^ imply 

(5.46) |P(M^<z)-$(z)| ==o(nP(y >u;)). 
Also,

(5.47) 1 - = o(e-^'/2) = o{nF{V > w)). 

Thus, (|5.43p . (|5.46p . and (|5.47p . along with the symmetry of the distribution of V, imply 

(5.48) P(A > 2z) P(r > z)+P{-W > z) = P{T > z)+¥{W > z) = o{nP{V > w)) . 
On the other hand, Proposition 3.11 implies

P(A > 2z)=F{V^S^ > 2z) = P{\J:^V,\ > V2w) ^ 

where 

8 := nP(y > \/2w). 

Recalling that w/n = k^I"^ oo and the tail probability (|5.38p . eventually 8 ^ nPiy > \/2n) 
nn~P \n~^ n — *■ 0, whence 

P(A > 2z) ^ f = i nP(y > V2w) ^ ^nV{V>w) 
for large enough n, which contradicts (|5.48p . □ 

Proof of Proposition 3.11. Introduce r.v.'s $T_j := S - 2X_j$ (obtained from S by flipping the sign of
$X_j$) and the disjoint events $A_j := \{\|X_1\| \le x, \dots, \|X_{j-1}\| \le x, \|X_j\| > x\}$ for $j = 1, \dots, n$. Then
$X_j = \frac12 S - \frac12 T_j$, and so, $\|X_j\| \le \frac12\|S\| + \frac12\|T_j\|$. Hence, the occurrence of event $A_j$ implies that either
$\|S\| > x$ or $\|T_j\| > x$. It follows that $\mathsf{P}(A_j) \le \mathsf{P}(A_j; \|S\| > x) + \mathsf{P}(A_j; \|T_j\| > x) = 2\,\mathsf{P}(A_j; \|S\| > x)$,
by the symmetry. Summing now in j, one has $\mathsf{P}(\max_i \|X_i\| > x) = \sum_j \mathsf{P}(A_j) \le 2\,\mathsf{P}(\|S\| > x)$, so
that the first inequality in (3.19) is proved.

To prove the second one, observe that $\mathsf{P}(A_j) = \mathsf{P}(\|X_j\| > x)\,\mathsf{P}(\max_{i<j} \|X_i\| \le x) \ge \mathsf{P}(\|X_j\| > x)\,\mathsf{P}(\max_i \|X_i\| \le x)$.
Summing now in j, one has $\mathsf{P}(\max_i \|X_i\| > x) \ge \sum_i \mathsf{P}(\|X_i\| > x)\,\mathsf{P}(\max_i \|X_i\| \le x)$,
from which the second inequality in (3.19) follows. □
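
The two inequalities established in this proof, $\mathsf{P}(\max_i \|X_i\| > x) \le 2\,\mathsf{P}(\|S\| > x)$ and $\mathsf{P}(\max_i \|X_i\| > x) \ge \sum_i \mathsf{P}(\|X_i\| > x)\,\mathsf{P}(\max_i \|X_i\| \le x)$, can be checked exactly on a small example. The sketch below is an illustration only: the real-valued summands, the symmetric four-point law, and n = 3 are our assumptions, chosen so that all probabilities can be enumerated with exact rational arithmetic.

```python
from fractions import Fraction
from itertools import product

def exact_probabilities(values, n, x):
    """For X_1..X_n i.i.d. uniform on `values` (symmetric about 0), return
    (P(max|X_i| > x), P(|S| > x), P(|X_1| > x)) as exact fractions."""
    weight = Fraction(1, len(values) ** n)
    p_max = Fraction(0)
    p_sum = Fraction(0)
    for outcome in product(values, repeat=n):
        if max(abs(v) for v in outcome) > x:
            p_max += weight
        if abs(sum(outcome)) > x:
            p_sum += weight
    p_one = Fraction(sum(1 for v in values if abs(v) > x), len(values))
    return p_max, p_sum, p_one

values = (-2, -1, 1, 2)   # a symmetric law, as the proof requires
n = 3
checks = []
for x in (0.5, 1.5, 2.5):
    p_max, p_sum, p_one = exact_probabilities(values, n, x)
    first = p_max <= 2 * p_sum                     # first inequality
    second = p_max >= n * p_one * (1 - p_max)      # second inequality
    checks.append((first, second))
    print(x, p_max, p_sum, first, second)
```

The symmetry of the summands is used only for the first inequality; the second relies on independence alone.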

Proof of Lemma \5.1[ Suppose first that p ^ 3, so that, in accordance with (|5.20p . A = ^||S'|p. 
Using the Rosenthal- type inequality [1^ Corollary 1] with (|3.ip and recalling the definitions (|3.5p . 
p.8p . and (|3.9p of Aq,, u, and v, respectively, 

IIAII, = ^ ll^ll^, ^{s% + D^sl) ^ |^(A^, + D^Xl) ^ ^^{u^ + v% 

which proves the lemma for p ^ 3. 

Now suppose that 2 < p < 3. We have, by two applications of Holder's inequality and (|5.16p . 

, , p5||-||E.EXj{h,|>i}||«;E.E|l^.l|i{h.l>i} 

(5.49) , , , 

< E,ll^.||pl|I{kl > ^ 5^6,(1)1/" ^ s,A^/^ = ^ AP. 

Let 

Xi:=Xi-¥.X, and 5* := ^ X, = ^ - E 5, 

i 

SO that for all a ^ 1 

(5.50) \\X,\\l ^ 2"-i(||l,||^ + ||EX,|r) «C 2"||Xi||« 

by Jensen's inequality. With (|5.49p and (|5.50p in mind, and using [3^, Corollary 1] with (|3.ip . 

^||A||, = ||^||^^<2(||5|!^, + ||E^||2) 



where in the penultimate inequality the definition (j5.19p is used. This completes the proof of the 
lemma. □ 
Proof of Lemma



Suppose first that p ^ 3. Then, for i — I, . 



,n, 



A- A,l = 



a 



\\sf-\\s-x. 



\\S\\ - \\S-X,\\ ||5|| + II5-X 



||X,|| +2||5-X,|| = 



IX 



2||Xi|| IIS — Xi 



whence 
(5.51) 



|A- Aill, 



|x.||^, + 2||x,|U|l5-x,|lJ 



IX 



■2q 



2(7 



v\\X, 



411? ; J 



since \\S - XJ, i^\\S- X^Ha Ds2 = crv/\\L\\ by ^ and fSH). Thus, 



^E|C.(A-A,)| < ^11611,11 A -A, 
1=1 1=1 

Ci 



1 1 2 I 'lav 



< — dp (S2g + JL^ VSq) S; \p [U + 2vXq) , 



where Hölder's inequality is used in the first and third inequalities, and (5.15) and the definitions
(3.5), (3.8), (3.9) are used in the last. This proves the lemma for $p \ge 3$.

Suppose then that $p \in [2, 3)$. Similarly to (5.51) and using the truncation in the definition (5.19),



Ci 



A - A.ll, ^ -(IIX.II^, + 2||X,||,||5 - X.ll,) ^ ^( ( - )^-^/^||X 



P/9 



2||X,||,||5-X,||2 



also using p.ip and (|5.49p . and reasoning as in (|5.50p . one has 
Thus, 



S - X,||2 \\S - X,||2 + ||E5 - EX.II sjp Ds2 + ^Xl^^ v. 



5^E|6(A-A,)| ^5^E||6||p||A-A,|U 



i=l 



i—l 



\\L\\'- 



The lemma is thus proved for $p \in [2,3)$ as well. □

Proof of Lemma 5.3. W.l.o.g. $z \ge 1$. That $\mathsf{P}(\|S\| > \varepsilon) \le \mathsf{P}(\|S\| > x)$ follows from $x \le \varepsilon$, which
follows by the condition (3.10) on $|z|$. Write

(5.52) P(||5|| ^ x) < Gx{y) + nWSyW > x), 

where y := x/{2p), Sy := Y.2^i,y' ^^^^ Xi^y := XjI{||Xi|| < y}. Note that 

lliE^.ll = ||E.^a{iix,ii>y}||^i^|, 

where the last inequality is equivalent to inequality ^ ^ 1, with A2 := 2ApCiS2/(J < Ai, and 
1 follows because w.l.o.g. ^ ^ 1 (since otherwise the right-hand side of inequality l|5.2ip is 



greater than 1). Let Xi^y :— X^ 



~ EXi^j^ and Sy := J^i ^i.y^ ^^at (|3.ip iniphes 

E||4|K||5,||2«;^S2^^, 
with this last inequality equivalent to ^ ^ 1, with A3 :— A'BtCiD'^ s\/ a ^ Ai (as p ^ 2). Thus,
using the exponential inequality of [40l Corollary 2], 

P(||5,|| ^x) «;P(|15,||-E||5,|| >x-E||4||-||ES,||) 
^P(||5,||-E||5,|| ^f) 

^ /2ei^y/''^^_ A£ 
^ \ xy ) zP' 

Now it remains to recall (5.52). □

5.3. Proofs of results from Section 4.

Proof of Corollary 4.1. Recall the various simplifications with notation in the i.i.d. case (see the
proof of Corollary 3.8). Then



R( A f {C^envU , (l|i^lini3M)S ^ A (A, A, 

follows by (|3.14p . using Chebyshev's inequality on the tail probabilities Gy, and also recalling 
11^1! ^ Cie. Similarly, recalling that D = 1 (as X is a Hilbert space) and using Lyapounov's 
inequality ^ W^Wp whenever < a ^ /3, the right-hand side of (|3.17|) is bounded by 

A ( Ai r. 



where 



||^lp-((Ai + A^)(l + A3) + A3A.A3/2) + (1 + A3) 

CiA„.n2n , H.nHT.n /_ x , „.n„T.„3/. A , / II ^11 II ^lU \ ^ 



^ ^-i(||T/|l^(i + IlilinUM) + IlLlini^M) + ( " 'H'^ (1 + Ililllli^lUM) 

^ a(| \\v\\l + (WMU)-^) (1 WMU) ^ 

The result (|4.2p now follows upon combining p.l7p and (|2.13p . Concerning (|4.ip . use Theorem [ 
and similar arguments as above; note that P(||S'|| > e) ^ ll^lli/l'^E^) ^-s per Remark 13.31 □ 



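Proof of Theorem 4.2. Let $\mathfrak{X} = \mathbb{R}^2$, and consider the $\mathfrak{X}$-valued r.v.'s defined as

\[ V_i = \bigl(X_i - \mu,\ (X_i - \mu)^2 - 1\bigr) \quad\text{and}\quad \bar V = \frac1n\sum_{i=1}^n V_i = \bigl(\bar X - \mu,\ \overline{X^2} - 2\mu\bar X + \mu^2 - 1\bigr). \]

Next let $f\colon \mathfrak{X} \to \mathbb{R}$ be defined by

\[ f(x) = f(x_1, x_2) = \frac{x_1 + \mu}{\sqrt{x_2 + 1 - x_1^2}} - \mu \]

for $x = (x_1, x_2) \in \mathfrak{X}$ with $\|x\| \le \frac12$, so that $x_2 + 1 - x_1^2 \ge \frac14$. Note that $f(0) = 0$,
$L(x) := f'(0)(x) = x_1 - \frac{\mu}{2}x_2$, so that $\|L\| = \sqrt{1 + \mu^2/4}$, and also

\[ \|L(V_1)\|_2 = \bigl\|(X - \mu) - \tfrac{\mu}{2}\bigl((X - \mu)^2 - 1\bigr)\bigr\|_2 = \bigl\|\tfrac{\mu}{2}(X - \mu)^2 - (X - \mu) - \tfrac{\mu}{2}\bigr\|_2 = \sigma_1. \]

Recalling the form of T in (4.5), one has

\[ \frac{f(\bar V)}{\sigma_1/\sqrt{n}} = \frac{T - \sqrt{n}\,\mu}{\sigma_1} \]

on the event $\{\|\bar V\| \le \frac12\}$. So, by Corollary 4.1, it need only be verified that f satisfies the
smoothness condition (3.2); for this it suffices
that $\|f''(x)\|$ is uniformly bounded over all $x \in \mathfrak{X}$ such that $\|x\| \le \varepsilon$ for some $\varepsilon > 0$, which is
obvious. □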
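
The algebra in this proof can be confirmed numerically. The sketch below rests on an assumption we state explicitly: T is taken to be $\sqrt{n}\,\bar X/S$ with $S^2 = \frac1n\sum_i (X_i - \bar X)^2$, which is the form suggested by the identity $x_2 + 1 - x_1^2 = S^2$ at $x = \bar V$. All function names, the seed, and the value of $\mu$ are illustrative. The code checks that $f(\bar V) = T/\sqrt{n} - \mu$ and that the gradient of f at 0 matches $L(x) = x_1 - \frac{\mu}{2}x_2$.

```python
import math
import random

def student_t(xs):
    """T = sqrt(n) * mean / S, with S^2 the empirical (1/n) variance (assumed form)."""
    n = len(xs)
    xbar = sum(xs) / n
    s2 = sum((x - xbar) ** 2 for x in xs) / n
    return math.sqrt(n) * xbar / math.sqrt(s2)

def v_bar(xs, mu):
    """Sample mean of V_i = (X_i - mu, (X_i - mu)^2 - 1)."""
    n = len(xs)
    return (sum(x - mu for x in xs) / n,
            sum((x - mu) ** 2 - 1.0 for x in xs) / n)

def f(x1, x2, mu):
    """The function f from the proof of Theorem 4.2."""
    return (x1 + mu) / math.sqrt(x2 + 1.0 - x1 * x1) - mu

rng = random.Random(1)
mu = 0.7
xs = [mu + rng.gauss(0.0, 1.0) for _ in range(500)]
n = len(xs)

# Identity f(V_bar) = T / sqrt(n) - mu.
identity_gap = abs(f(*v_bar(xs, mu), mu) - (student_t(xs) / math.sqrt(n) - mu))

# Central finite differences for the gradient of f at 0; expected (1, -mu/2).
h = 1e-5
grad = ((f(h, 0.0, mu) - f(-h, 0.0, mu)) / (2 * h),
        (f(0.0, h, mu) - f(0.0, -h, mu)) / (2 * h))
print(identity_gap, grad)
```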
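Proof of Theorem 4.4. Let $\mathfrak{X} = \mathbb{R}^5$, and define the $\mathfrak{X}$-valued r.v.'s

\[ V_i = \bigl(X_i,\ Y_i,\ X_i^2 - 1,\ Y_i^2 - 1,\ X_iY_i - \rho\bigr) \quad\text{and}\quad \bar V = \frac1n\sum_{i=1}^n V_i = \bigl(\bar X,\ \bar Y,\ \overline{X^2} - 1,\ \overline{Y^2} - 1,\ \overline{XY} - \rho\bigr). \]

Next let $f\colon \mathfrak{X} \to \mathbb{R}$ be defined by

\[ f(x) = f(x_1, x_2, x_3, x_4, x_5) := \frac{x_5 + \rho - x_1x_2}{\sqrt{x_3 + 1 - x_1^2}\,\sqrt{x_4 + 1 - x_2^2}} - \rho \]

for $x = (x_1, x_2, x_3, x_4, x_5) \in \mathfrak{X}$ with $\|x\| \le \frac12$. Then $f(0) = 0$ and $L(x) := f'(0)(x) = x_5 - \frac{\rho}{2}(x_3 + x_4)$,
so that $\|L\| = \sqrt{1 + \rho^2/2}$, and also

\[ \|L(V_1)\|_2 = \bigl\|XY - \rho - \tfrac{\rho}{2}(X^2 + Y^2 - 2)\bigr\|_2 = \bigl\|XY - \tfrac{\rho}{2}(X^2 + Y^2)\bigr\|_2 = \sigma_1. \]

Recalling the form of R in (4.8), one has

\[ \frac{f(\bar V)}{\sigma_1/\sqrt{n}} = \frac{R - \rho}{\sigma_1/\sqrt{n}} \]

on the event $\{\|\bar V\| \le \frac12\}$, and, in view of Corollary 4.1, it remains to verify that f satisfies the
smoothness condition (3.2), which is obvious. □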

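Proof of Theorem 4.6. Identify $\mathfrak{X} = \mathbb{R}^{k + k(k+1)/2}$ with the set of ordered pairs $x = (x_1, x_2)$, where
$x_1 \in \mathbb{R}^k$ and $x_2$ is a symmetric $k \times k$ matrix over $\mathbb{R}$; equip $\mathfrak{X}$ with the norm

\[ \|x\| := \sqrt{\|x_1\|^2 + \|x_2\|_2^2} = \sqrt{x_1^Tx_1 + \operatorname{tr}(x_2^2)} \]

for $x \in \mathfrak{X}$, so that $\mathfrak{X}$ is a Hilbert space with this norm. Then let

\[ V_i = \bigl(X_i - \mu,\ (X_i - \mu)(X_i - \mu)^T - I\bigr) \quad\text{and}\quad \bar V = \frac1n\sum_{i=1}^n V_i. \]

Note that

\[ \|V_i\|^2 = \|X_i - \mu\|^4 - \|X_i - \mu\|^2 + k. \]

Next let $f\colon \mathfrak{X} \to \mathbb{R}$ be defined by

\[ f(x) = (x_1 + \mu)^T(I + x_2 - x_1x_1^T)^{-1}(x_1 + \mu) - \mu^T\mu \]

for $x = (x_1, x_2) \in \mathfrak{X}$ with $\|x\| \le \frac12$, so that $\|I + x_2 - x_1x_1^T\|_2 \ge 1 - \|x_2\|_2 - \|x_1x_1^T\|_2 \ge 1 - \|x\| - \|x\|^2 \ge \frac14$
and $\|(I + x_2 - x_1x_1^T)^{-1}\|_2 \le 4$.

Using the obvious identity $(B + A)^{-1} - B^{-1} = -(B + A)^{-1}(B + A - B)B^{-1}$, one sees that
the derivative of the nonlinear operator $B \mapsto B^{-1}$ at a "point" B is the linear operator $A \mapsto -B^{-1}AB^{-1}$,
where the "point" B is any nonsingular matrix. Hence, the second derivative $f''(x)$
is bounded over all $x \in \mathfrak{X}$ with $\|x\| \le \frac12$, and $L(x) = f'(0)(x) = 2x_1^T\mu - \mu^Tx_2\mu$ for all $x \in \mathfrak{X}$. Then

\[ \|L\| = \sqrt{4\|\mu\|^2 + \|\mu\|^4} = \|\mu\|\sqrt{4 + \|\mu\|^2}, \]

and also

\[ \|L(V_1)\|_2 = \bigl\|2(X-\mu)^T\mu - \mu^T\bigl((X-\mu)(X-\mu)^T - I\bigr)\mu\bigr\|_2 = \bigl\|((X-\mu)^T\mu)^2 - 2(X-\mu)^T\mu - \mu^T\mu\bigr\|_2 = \sigma_1. \]

It remains to refer to Corollary 4.1. □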
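
The identity for the difference of inverses used in this proof, and the resulting derivative $A \mapsto -B^{-1}AB^{-1}$ of $B \mapsto B^{-1}$, can be illustrated in the case k = 2. The sketch below uses hand-written $2 \times 2$ helpers; the matrices B and A and the step t are arbitrary illustrative choices of ours.

```python
def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def inverse(m):
    """Inverse of a nonsingular 2x2 matrix."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def add(a, b, scale=1.0):
    """Entrywise a + scale * b."""
    return [[a[i][j] + scale * b[i][j] for j in range(2)] for i in range(2)]

def max_abs(m):
    return max(abs(v) for row in m for v in row)

B = [[2.0, 0.5], [0.5, 1.0]]
A = [[0.3, -0.2], [-0.2, 0.1]]

# Exact identity: (B + A)^{-1} - B^{-1} = -(B + A)^{-1} A B^{-1}.
lhs = add(inverse(add(B, A)), inverse(B), scale=-1.0)
rhs = [[-v for v in row] for row in matmul(matmul(inverse(add(B, A)), A), inverse(B))]
identity_gap = max_abs(add(lhs, rhs, scale=-1.0))

# First-order behaviour: ((B + tA)^{-1} - B^{-1}) / t is close to -B^{-1} A B^{-1}.
t = 1e-6
quotient = [[v / t for v in row] for row in add(inverse(add(B, A, scale=t)), inverse(B), scale=-1.0)]
derivative = [[-v for v in row] for row in matmul(matmul(inverse(B), A), inverse(B))]
derivative_gap = max_abs(add(quotient, derivative, scale=-1.0))
print(identity_gap, derivative_gap)
```

The first gap is at rounding level, while the second is of order t, as expected for a difference quotient.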